========================================================
I chose to explore the financial contributions from Georgia to the election campaigns. What differences can we find between Republican and Democratic supporters?
setwd('~/Documents/datasets')
ga_obama <-read.csv('P80003338-GA.csv')
The first file I downloaded had only 31 observations for a single candidate, Ted Cruz, so I went back to the website to understand what data was available. I then downloaded the dataset of Georgia contributions to President Obama’s 2012 campaign. When I first tried to load the csv file, I got an error indicating that the first column did not have unique row names. I checked the documentation on www.fec.gov and saw that “tran_id” (transaction id) is supposed to be unique within the dataset. I moved that column to the first position and was then able to read in the file.
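Since the fix relies on “tran_id” being unique, it may be worth checking that directly once a file is loaded. The following is a minimal sketch on a made-up data frame (the `toy` object and its values are hypothetical, not taken from the FEC file):

```r
# Toy stand-in for the contribution file; the values are made up for illustration.
toy <- data.frame(
  tran_id = c("C1", "C2", "C3", "C3"),
  contb_receipt_amt = c(15, 250, 25, 25),
  stringsAsFactors = FALSE
)

# anyDuplicated() returns 0 when every id is unique,
# otherwise the index of the first repeated value.
first_dup <- anyDuplicated(toy$tran_id)

# Number of distinct ids versus number of rows.
n_ids  <- length(unique(toy$tran_id))
n_rows <- nrow(toy)
```

On the real data the same two calls would be applied to `ga_obama$tran_id`. The str() output for the Obama file reports 96601 levels of tran_id for 96601 rows, which is what a unique id should look like.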
str(ga_obama)
## 'data.frame': 96601 obs. of 18 variables:
## $ tran_id : Factor w/ 96601 levels "C10968531","C10968533",..: 1798 1153 1314 1079 1015 1620 1437 1128 2003 1290 ...
## $ cmte_id : Factor w/ 1 level "C00431445": 1 1 1 1 1 1 1 1 1 1 ...
## $ cand_id : Factor w/ 1 level "P80003338": 1 1 1 1 1 1 1 1 1 1 ...
## $ cand_nm : Factor w/ 1 level "Obama, Barack": 1 1 1 1 1 1 1 1 1 1 ...
## $ contbr_nm : Factor w/ 19006 levels "AARON, BILLYE S",..: 2971 12501 13429 9836 10432 3207 14808 6476 5631 18533 ...
## $ contbr_city : Factor w/ 573 levels "A","AATLANTA",..: 32 32 42 42 32 45 32 32 289 196 ...
## $ contbr_st : Factor w/ 1 level "GA": 1 1 1 1 1 1 1 1 1 1 ...
## $ contbr_zip : int 30311 303083347 30168 30168 303191018 300021562 303443231 303094147 300461264 302158012 ...
## $ contbr_employer : Factor w/ 8304 levels "","(SELF) ROBERTSON'S DECORATING CENTER I",..: 5982 3859 5334 5982 5982 5982 3859 6389 2277 5982 ...
## $ contbr_occupation: Factor w/ 5170 levels "","_","(R) RT",..: 4390 2227 1734 3923 3923 3923 3923 1460 346 4733 ...
## $ contb_receipt_amt: num 15 250 25 112 25 50 50 250 50 10 ...
## $ contb_receipt_dt : Factor w/ 613 levels "1-Apr-12","1-Aug-11",..: 202 389 556 291 88 612 345 369 400 495 ...
## $ receipt_desc : Factor w/ 2 levels "","Refund": 1 1 1 1 1 1 1 1 1 1 ...
## $ memo_cd : Factor w/ 2 levels "","X": 1 1 1 1 1 1 1 1 1 1 ...
## $ memo_text : Factor w/ 22 levels "","*","* EARMARKED CONTRIBUTION: SEE BELOW",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ form_tp : Factor w/ 3 levels "SA17A","SA18",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ file_num : int 756218 756218 756218 756218 756218 756218 756218 756218 756218 756218 ...
## $ election_tp : Factor w/ 5 levels "G2008","G2012",..: 5 5 5 5 5 5 5 5 5 5 ...
summary(ga_obama)
## tran_id cmte_id cand_id
## C10968531: 1 C00431445:96601 P80003338:96601
## C10968533: 1
## C10970191: 1
## C10970429: 1
## C10971061: 1
## C10972400: 1
## (Other) :96595
## cand_nm contbr_nm contbr_city
## Obama, Barack:96601 CORNUTT, SUSAN : 159 ATLANTA :27543
## SOUTHERLAND, LINDA: 150 DECATUR : 5777
## SHASH, AMIR : 144 MARIETTA : 4323
## LAMB, ALYSE : 131 ATHENS : 2521
## CAUTHEN, GEORGE : 119 STONE MOUNTAIN: 2366
## GOODELL, JANNAH : 118 ALPHARETTA : 2351
## (Other) :95780 (Other) :51720
## contbr_st contbr_zip contbr_employer
## GA:96601 Min. : 40 RETIRED :17454
## 1st Qu.: 30312 SELF-EMPLOYED : 8410
## Median :300346461 NOT EMPLOYED : 8299
## Mean :170280434 INFORMATION REQUESTED: 2615
## 3rd Qu.:303095111 EMORY UNIVERSITY : 1104
## Max. :912142739 (Other) :58713
## NA's : 6
## contbr_occupation contb_receipt_amt contb_receipt_dt
## RETIRED :18751 Min. :-5000.00 17-Oct-12: 3153
## ATTORNEY : 3337 1st Qu.: 19.00 2-Nov-12 : 2758
## PHYSICIAN : 2518 Median : 35.00 31-Aug-12: 1986
## INFORMATION REQUESTED: 2371 Mean : 99.27 31-Oct-12: 1882
## HOMEMAKER : 2051 3rd Qu.: 100.00 23-Oct-12: 1836
## (Other) :67572 Max. : 5000.00 28-Sep-12: 1662
## NA's : 1 (Other) :83324
## receipt_desc memo_cd
## :95852 :77835
## Refund: 749 X:18766
##
##
##
##
##
## memo_text form_tp
## :77711 SA17A:77102
## * OBAMA VICTORY FUND 2012 :18638 SA18 :18750
## * : 122 SB28A: 749
## * EARMARKED CONTRIBUTION: SEE BELOW : 82
## EXCESSIVE CONTRIBUTION REFUNDED OCT. 2012: 11
## EXCESSIVE CONTRIB. REFUNDED SEPT. 2012 : 9
## (Other) : 28
## file_num election_tp
## Min. :756214 G2008: 3
## 1st Qu.:810684 G2012:52501
## Median :821325 O2012: 89
## Mean :820035 P2008: 8
## 3rd Qu.:840327 P2012:44000
## Max. :853328
##
names(ga_obama)
## [1] "tran_id" "cmte_id" "cand_id"
## [4] "cand_nm" "contbr_nm" "contbr_city"
## [7] "contbr_st" "contbr_zip" "contbr_employer"
## [10] "contbr_occupation" "contb_receipt_amt" "contb_receipt_dt"
## [13] "receipt_desc" "memo_cd" "memo_text"
## [16] "form_tp" "file_num" "election_tp"
library(ggplot2)
First, I wanted to understand the number of people donating to President Obama’s campaign at the different amount levels.
qplot(x=contb_receipt_amt, data = ga_obama)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
qplot(x=contb_receipt_amt, data = ga_obama, binwidth=10, xlab="Amount of Contribution",ylab="Number of contributors") +
scale_x_continuous(limits=c(0,2000), breaks=seq(0,2000,100))
I noticed that most people contributed less than $150, so I narrowed the scale to see this range more closely:
qplot(x=contb_receipt_amt, data = ga_obama, binwidth=10, xlab="Amount of Contribution",ylab="Number of contributors") +
scale_x_continuous(limits=c(0,275), breaks=seq(0,275,25))
qplot(x=contb_receipt_amt, y= ..count../sum(..count..), data = ga_obama,
xlab="Amount of Contribution",
ylab="Proportion of users who contributed that amount",
geom='freqpoly')
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
scale_x_continuous()
## continuous_scale(aesthetics = c("x", "xmin", "xmax", "xend",
## "xintercept"), scale_name = "position_c", palette = identity,
## expand = expand, guide = "none")
qplot(x=contb_receipt_amt, y= ..count../sum(..count..), data = ga_obama,
xlab="Amount of Contribution",
ylab="Proportion of users who contributed that amount",
geom='freqpoly') +
scale_x_continuous(limits=c(-500,500), breaks=seq(-500,500, 100))
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 2 rows containing missing values (geom_path).
These last two plots show that the majority of contributions to President Obama’s campaign were under $300. But how does the Republican candidate compare? I downloaded the Georgia contributions to Mitt Romney’s 2012 campaign from the FEC website.
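A claim like “the majority were under $300” can also be checked numerically rather than read off a plot. This sketch uses only the first ten amounts printed by str(ga_obama) above, so the resulting numbers illustrate the calculation and say nothing reliable about the full file:

```r
# First ten contb_receipt_amt values printed by str(ga_obama);
# far too few to describe the full dataset.
amt <- c(15, 250, 25, 112, 25, 50, 50, 250, 50, 10)

# A logical vector averages to the proportion of TRUE values.
share_under_100 <- mean(amt < 100)
share_under_300 <- mean(amt < 300)
```

On the real data, the equivalent expression would be `mean(ga_obama$contb_receipt_amt < 300, na.rm = TRUE)`.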
setwd('~/Documents/datasets')
ga_mitt <-read.csv('P80003353-GA.csv')
str(ga_mitt)
## 'data.frame': 52891 obs. of 18 variables:
## $ tran_id : Factor w/ 30037 levels "SA17.1000013",..: 13262 13128 13606 13793 13666 13423 12913 13723 13789 13722 ...
## $ cand_id : Factor w/ 1 level "P80003353": 1 1 1 1 1 1 1 1 1 1 ...
## $ cand_nm : Factor w/ 1 level "Romney, Mitt": 1 1 1 1 1 1 1 1 1 1 ...
## $ contbr_nm : Factor w/ 14511 levels "21ST CENTURY MAJORITY FUND",..: 545 546 557 1688 2659 2672 2679 594 1714 1758 ...
## $ contbr_city : Factor w/ 525 levels "ABBEVILLE","ACCATUR",..: 29 29 135 29 29 29 29 29 13 13 ...
## $ contbr_st : Factor w/ 1 level "GA": 1 1 1 1 1 1 1 1 1 1 ...
## $ contbr_zip : int 303062621 303051038 307220624 303054018 303424408 303264229 303051352 303395362 300044546 300225180 ...
## $ contbr_employer : Factor w/ 6313 levels "","(RETIRED BANKER)",..: 265 4739 4739 1036 596 1365 1368 3879 2495 2755 ...
## $ contbr_occupation: Factor w/ 2637 levels "","11B","A-320 INSTRUCTOR PILOT",..: 136 2000 2000 2529 1632 1749 277 136 1171 1088 ...
## $ contb_receipt_amt: num 250 2500 1000 250 2000 1000 2500 1000 2500 2500 ...
## $ contb_receipt_dt : Factor w/ 511 levels "1-Aug-11","1-Aug-12",..: 134 454 8 390 80 267 354 249 390 249 ...
## $ receipt_desc : Factor w/ 12 levels "","ATTRIBUTION TO PARTNERS REQUESTED / REDESIGNATION REQUESTED",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ memo_cd : Factor w/ 2 levels "","X": 1 1 1 1 1 1 1 1 1 1 ...
## $ memo_text : Factor w/ 51 levels "","ATTRIBUTION TO PARTNERS REQUESTED",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ form_tp : Factor w/ 3 levels "SA17A","SA18",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ file_num : int 760248 760248 760248 760248 760248 760248 760248 760248 760248 760248 ...
## $ tran_id.1 : Factor w/ 30037 levels "SA17.1000013",..: 13262 13128 13606 13793 13666 13423 12913 13723 13789 13722 ...
## $ election_tp : Factor w/ 2 levels "G2012","P2012": 2 2 2 2 2 2 2 2 2 2 ...
qplot(x=contb_receipt_amt, data = ga_mitt, binwidth=10, xlab="Amount of Contribution",ylab="Mitt Romney") +
scale_x_continuous(limits=c(0,3000), breaks=seq(0,3000,500))
First, note that there are 96601 contributions from Georgia to President Obama’s campaign and only 52891 to Mitt Romney’s campaign. Let’s compare the number of contributions at different levels, using the same scales:
q1 <- qplot(x=contb_receipt_amt, data = ga_mitt, binwidth=10, xlab="Amount of Contribution",ylab="Mitt Romney") +
scale_x_continuous(limits=c(0,3000), breaks=seq(0,3000,500))
q2 <- qplot(x=contb_receipt_amt, data = ga_obama, binwidth=10, xlab="Amount of Contribution",ylab="President Obama") +
scale_x_continuous(limits=c(0,3000), breaks=seq(0,3000,500))
library(gridExtra)
## Loading required package: grid
grid.arrange(q1,q2, ncol=1)
Comparing these two graphs, it is easy to see how many more contributions went to President Obama’s campaign than to Mitt Romney’s. Let’s now compare the distribution of amounts proportionately:
q3 <- qplot(x=contb_receipt_amt, y= ..count../sum(..count..), data = ga_obama,
xlab="Amount of Contribution",
ylab="President Obama",
geom='freqpoly') +
scale_x_continuous(limits=c(-100,2600), breaks=seq(-100,2600, 200))
q4 <- qplot(x=contb_receipt_amt, y= ..count../sum(..count..), data = ga_mitt,
xlab="Amount of Contribution",
ylab="Mitt Romney",
geom='freqpoly') +
scale_x_continuous(limits=c(-100,2600), breaks=seq(-100,2600, 200))
grid.arrange(q3,q4, ncol=1)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 3 rows containing missing values (geom_path).
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 3 rows containing missing values (geom_path).
#g <- arrangeGrob(q3,q4, ncol=1)
#ggsave(file="../P3/compare_amounts.pdf", g) #saves g
One can see that the bulk of President Obama’s supporters contributed less than $100, while contributions to Mitt Romney’s campaign had spikes around $500, $100 and $2500.
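The frequency polygons above used the default bin width (range/30), which differs between the two files. One way to compare proportions on identical bins is to fix the breaks by hand. This is a sketch only: `bin_share` is a hypothetical helper, and the inputs are just the first ten amounts printed by str() for each file, so the shares are illustrative.

```r
# First ten contb_receipt_amt values printed by str() for each file above.
obama_amt  <- c(15, 250, 25, 112, 25, 50, 50, 250, 50, 10)
romney_amt <- c(250, 2500, 1000, 250, 2000, 1000, 2500, 1000, 2500, 2500)

# Hypothetical helper: share of contributions falling in each fixed-width bin.
bin_share <- function(x, breaks) {
  bins <- cut(x, breaks = breaks, right = FALSE)  # left-closed bins [a, b)
  as.numeric(prop.table(table(bins)))
}

breaks <- seq(0, 3000, by = 500)  # [0,500), [500,1000), ..., [2500,3000)
obama_share  <- bin_share(obama_amt, breaks)
romney_share <- bin_share(romney_amt, breaks)
```

Because both vectors are binned with the same `breaks`, the two results line up bin by bin and can be compared directly.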
Next, let’s take a look at contribution amounts per city.
qplot(contbr_city,contb_receipt_amt, data = ga_obama)
Immediately noticeable in this first city vs. contribution plot is that negative contributions are plotted. What does this mean? I went back to the website to understand, and noticed an attribute called “receipt_desc”, which I plotted against the amount contributed. Almost all of the negative contributions had a description of “Refund”, while the positive contributions had a blank description, which explains the negative amounts. Since these rows don’t represent actual contributions, I will leave them out of my dataset.
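The relationship between negative amounts and the “Refund” description can also be checked with a cross-tabulation instead of a plot. A minimal sketch on made-up rows (the `toy` data frame is hypothetical and only mimics the two columns discussed):

```r
# Made-up rows that mimic the two columns discussed above.
toy <- data.frame(
  contb_receipt_amt = c(25, 100, -100, 50, -2500, -35),
  receipt_desc = c("", "", "Refund", "", "Refund", ""),
  stringsAsFactors = FALSE
)

# Cross-tabulate the sign of the amount against the description.
sign_by_desc <- table(negative = toy$contb_receipt_amt < 0,
                      desc = toy$receipt_desc)

# Keep only rows with a positive amount, as the analysis does.
toy_positive <- subset(toy, contb_receipt_amt > 0)
```

In the toy data, one negative row has a blank description, which mirrors the “almost all” in the text: filtering on the amount removes it, while filtering on the description alone would not.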
qplot(contb_receipt_amt,receipt_desc, data = ga_obama)
ga_obama_positive = subset(ga_obama, contb_receipt_amt >0)
qplot(contbr_city,contb_receipt_amt, data = ga_obama_positive)
This scatterplot is very hard to read, as the individual cities are not labeled and the data points are clumped together on the x-axis. You can see that most of the amounts were under $500, with clear lines at $1000, $1500, $2000, and $2500. I am going to build a new data set containing averages to see more detail.
library(dplyr)
##
## Attaching package: 'dplyr'
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
contbr_city_groups <- group_by(ga_obama_positive, contbr_city)
ga_obama.contrib_by_city <- summarise(contbr_city_groups,
contrib_mean = mean(contb_receipt_amt),
contrib_median = median(contb_receipt_amt),
n = n())
ga_obama.contrib_by_city <- arrange(ga_obama.contrib_by_city, contbr_city)
head(ga_obama.contrib_by_city,30)
## Source: local data frame [30 x 4]
##
## contbr_city contrib_mean contrib_median n
## 1 A 175.00000 175 2
## 2 AATLANTA 5.00000 5 1
## 3 ACWORTH 73.16497 35 591
## 4 ADAIRSVILLE 95.14286 25 14
## 5 ADEL 116.60870 100 23
## 6 AILEY 533.33333 300 3
## 7 ALAPAHA 50.16000 50 25
## 8 ALBANY 101.40471 50 433
## 9 ALBERTON 150.00000 100 3
## 10 ALEXANDRIA 95.00000 100 5
## .. ... ... ... ...
ggplot(aes(x=contbr_city, y = contrib_mean),data = ga_obama.contrib_by_city ) +
geom_point() + scale_x_discrete()
This plot is easier to read, but I would like to make it wider and have all the different cities listed for comparison. Let’s look only at cities with n (number of contributions) > 200. I also rotated the city labels so they can be read.
ggplot(aes(x=contbr_city, y = contrib_mean),data = subset(ga_obama.contrib_by_city, n>200) ) +
geom_point() + scale_x_discrete() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))
Adding color for readability, still with n > 200.
ggplot(aes(x=contbr_city, y = contrib_mean),data = subset(ga_obama.contrib_by_city, n>200) ) +
geom_point(color="blue") + scale_x_discrete() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))
### Now on to the Republicans!
I went back to the FEC website and found the Georgia contributions to Mitt Romney in 2012. I made the same adjustments using the transaction id and read the file in.
qplot(contbr_city,contb_receipt_amt, data = ga_mitt)
Again, I noticed negative contributions and filtered those out. I also need to group by city and take the average, as done for the Obama data.
qplot(contbr_city,contb_receipt_amt, data = ga_mitt)
ga_mitt_positive = subset(ga_mitt, contb_receipt_amt >0)
qplot(contbr_city,contb_receipt_amt, data = ga_mitt_positive)
mitt_contbr_city_groups <- group_by(ga_mitt_positive, contbr_city)
ga_mitt.contrib_by_city <- summarise(mitt_contbr_city_groups,
contrib_mean = mean(contb_receipt_amt),
contrib_median = median(contb_receipt_amt),
n = n())
ga_mitt.contrib_by_city <- arrange(ga_mitt.contrib_by_city, contbr_city)
ggplot(aes(x=contbr_city, y = contrib_mean),data = subset(ga_mitt.contrib_by_city, n>200) ) +
geom_point(color="red") + scale_x_discrete() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))
Plot both together and compare the average donation amount per city by party. I used red for the Mitt Romney contributions.
library(gridExtra)
obamap1 <- ggplot(aes(x=contbr_city, y = contrib_mean),data = subset(ga_obama.contrib_by_city, n>200) ) +
geom_point(color="blue") + scale_x_discrete() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))
mittp2 <- ggplot(aes(x=contbr_city, y = contrib_mean),data = subset(ga_mitt.contrib_by_city, n>200) ) +
geom_point(color="red") + scale_x_discrete() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))
grid.arrange(obamap1,mittp2, ncol=1)
We can easily see that average contributions by city to Mitt Romney’s campaign are much greater than the average contributions by city to President Obama’s campaign. Also, the number of cities with more than 200 contributions is greater for President Obama than for Mitt Romney, which matches our finding above for the overall number of contributions. I would like to see this on one plot for easier comparison, so I created two new data sets, bound them together using rbind, and plotted them on the same axes.
visual1= data.frame(subset(ga_obama.contrib_by_city, n>200))
visual2= data.frame(subset(ga_mitt.contrib_by_city, n>200))
visual1$group <- "obama"
visual2$group <- "mitt"
visual12 <- rbind(visual1, visual2)
ggplot(visual12, aes(x=contbr_city, y=contrib_mean, group=group, col=group, fill=group)) +
geom_point() + scale_x_discrete() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5)) + scale_y_continuous(limits=c(0, 1500),breaks=seq(0,1500,200))
mean_contrib_city_both <- ggplot(visual12, aes(x=contbr_city, y=contrib_mean, group=group, col=group, fill=group)) +
geom_point() + scale_x_discrete() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5)) + scale_y_continuous(limits=c(0, 1500),breaks=seq(0,1500,200))
#ggsave(mean_contrib_city_both,file="../P3/both.png",width=15,height=3)
Across the board, average Republican contributions per city are higher than Democratic ones.
Next, I would like to compare the contributions of retirees to both campaigns. How to get this data?
ga_obama$retired <- "N"
ga_obama$retired[grepl("RETIRED", ga_obama$contbr_occupation) == TRUE] <- "Y"
table(ga_obama$retired)
##
## N Y
## 76471 20130
In this table, there is an occupation “RETIRED” plus many occupations that contain the word “RETIRED”, so I decided to create a new variable ‘retired’. About 20.8% of the contributions to Obama (20,130 of 96,601) come from retirees.
ga_mitt$retired <- "N"
ga_mitt$retired[grepl("RETIRED", ga_mitt$contbr_occupation) == TRUE] <- "Y"
table(ga_mitt$retired)
##
## N Y
## 39741 13150
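The retired share for each campaign follows directly from the two tables. The sketch below copies the printed counts and divides the “Y” count by the total number of contributions (not by the “N” count):

```r
# Counts copied from the two table() outputs above.
obama_retired  <- c(N = 76471, Y = 20130)
romney_retired <- c(N = 39741, Y = 13150)

# Share of contributions flagged retired = Y / (N + Y), in percent.
obama_pct  <- 100 * obama_retired[["Y"]] / sum(obama_retired)
romney_pct <- 100 * romney_retired[["Y"]] / sum(romney_retired)

retired_pct <- round(c(obama = obama_pct, romney = romney_pct), 1)
```

This gives roughly 20.8% for Obama and 24.9% for Romney. On the data frames themselves, `prop.table(table(ga_obama$retired))` would produce the same proportions without copying numbers by hand.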
For Mitt Romney, about 24.9% of contributions (13,150 of 52,891) come from retirees, compared with 20,130 of 96,601 for Obama. Let’s first look at contribution amounts for President Obama by retired vs. working people.
ggplot(aes(x=contbr_city, y = contb_receipt_amt),data = ga_obama ) +
geom_point(aes(color=retired), stat='summary', fun.y=median) + scale_x_discrete() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))
This plot had too many city points to be useful, and city is not providing too much information in the analysis.
p1 <- qplot(x=contb_receipt_amt, data = subset(ga_obama,retired=="Y"), binwidth=10, xlab="Amount of Contribution",ylab="Number of retired contributors") +
scale_x_continuous(limits=c(0,2000), breaks=seq(0,2000,100))
p2 <- qplot(x=contb_receipt_amt, data = subset(ga_obama,retired=="N"), binwidth=10, xlab="Amount of Contribution",ylab="Number of working contributors") +
scale_x_continuous(limits=c(0,2000), breaks=seq(0,2000,100))
grid.arrange(p1,p2, ncol=1)
p3 <- qplot(x=contb_receipt_amt, y= ..count../sum(..count..), data = subset(ga_obama,retired=="Y"),
xlab="Amount of Contribution",
ylab="President Obama, Retired",
geom='freqpoly') +
scale_x_continuous(limits=c(-100,2600), breaks=seq(-100,2600, 200))
p4 <- qplot(x=contb_receipt_amt, y= ..count../sum(..count..), data = subset(ga_obama,retired=="N"),
xlab="Amount of Contribution",
ylab="President Obama, Working",
geom='freqpoly') +
scale_x_continuous(limits=c(-100,2600), breaks=seq(-100,2600, 200))
grid.arrange(p3,p4, ncol=1)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 3 rows containing missing values (geom_path).
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 3 rows containing missing values (geom_path).
This last graph shows that the distribution of contribution amounts was very similar between retired and non-retired persons in President Obama’s campaign. Now let’s look at Mitt Romney’s campaign:
p5 <- qplot(x=contb_receipt_amt, y= ..count../sum(..count..), data = subset(ga_mitt,retired=="Y"),
xlab="Amount of Contribution",
ylab="Mitt Romney, Retired",
geom='freqpoly') +
scale_x_continuous(limits=c(-100,2600), breaks=seq(-100,2600, 200))
p6 <- qplot(x=contb_receipt_amt, y= ..count../sum(..count..), data = subset(ga_mitt,retired=="N"),
xlab="Amount of Contribution",
ylab="Mitt Romney, Working",
geom='freqpoly') +
scale_x_continuous(limits=c(-100,2600), breaks=seq(-100,2600, 200))
grid.arrange(p5,p6, ncol=1)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 3 rows containing missing values (geom_path).
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 3 rows containing missing values (geom_path).
This last graph shows that the distribution of contribution amounts was similar between retired and non-retired persons in Mitt Romney’s campaign, although more retired people were giving at the lower levels than working people.
Let’s finally analyze all occupations, not just retired/working. I am going to narrow this down to my town, Athens (home of University of Georgia). This gave us 2521 contributions.
ga_obama_athens= subset(ga_obama, contbr_city=="ATHENS")
library(dplyr)
occupation_groups <- group_by(ga_obama_athens, contbr_occupation)
ga_obama_athens.contrib_by_occ <- summarise(occupation_groups,
contrib_mean = mean(contb_receipt_amt),
contrib_median = median(contb_receipt_amt),
n = n())
ga_obama_athens.contrib_by_occ <- arrange(ga_obama_athens.contrib_by_occ, contbr_occupation)
ggplot(aes(x=contbr_occupation, y = contrib_mean),data = subset(ga_obama_athens.contrib_by_occ, n>10) ) +
geom_point(color="blue") + scale_x_discrete() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))
This last plot was very interesting to me. When I first narrowed the number of contributors to over 100 for an occupation, all that was plotted was “PROFESSOR” and “RETIRED”, which is exactly what you would expect in a college town. I then let the number of contributors with that occupation go down to 5 to see all the different occupations. Many of these are associated with UGA. It was also interesting that the occupation with the highest average amount was “HOMEMAKER”.
Athens, Georgia is known as a “blue” town in a “red” state. Overall in Georgia, 64.6% of contributions went to Obama; let’s see the percentage in Athens:
ga_mitt_athens= subset(ga_mitt, contbr_city=="ATHENS")
This gives us only 456 contributions to Mitt Romney’s campaign from Athens. So for Athens, the share of contributions going to Obama is about 84.7% (2521 of 2977), compared to 64.6% for Georgia overall.
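The two percentages can be reproduced from the contribution counts quoted in the text. A small sketch, where `obama_share` is a hypothetical helper introduced only for this illustration:

```r
# Contribution counts quoted in the text above.
ga     <- c(obama = 96601, romney = 52891)
athens <- c(obama = 2521,  romney = 456)

# Hypothetical helper: Obama's percentage share of the two-candidate total.
obama_share <- function(counts) 100 * counts[["obama"]] / sum(counts)

ga_pct     <- obama_share(ga)
athens_pct <- obama_share(athens)
```

Rounded to one decimal place, this comes out at about 64.6% for Georgia and 84.7% for Athens. Note that these are shares of contributions between the two candidates only, not shares of donors or of dollars.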
Plot One
grid.arrange(q3,q4, ncol=1)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 3 rows containing missing values (geom_path).
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 3 rows containing missing values (geom_path).
Plot One Description
This plot shows that contributions to President Obama’s campaign were made at lower levels than those to Mitt Romney’s.
Plot Two
ggplot(visual12, aes(x=contbr_city, y=contrib_mean, group=group, col=group, fill=group)) +
geom_point() + scale_x_discrete() + theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5)) + scale_y_continuous(limits=c(0, 1500),breaks=seq(0,1500,200))
Plot Two Description
Across the board, average Republican contributions per city are higher than Democratic ones.
Plot Three
grid.arrange(p3,p4, ncol=1)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 3 rows containing missing values (geom_path).
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: Removed 3 rows containing missing values (geom_path).
Plot Three Description
Retired persons gave at the same level as working persons in President Obama’s campaign.
I started with 96,601 contributions to President Obama and 52,891 contributions to Mitt Romney in the 2012 presidential campaign from Georgia. The data showed that although the contribution amounts were lower to President Obama’s campaign, more people contributed at these lower levels. I looked at this by city and by occupation status (retired or working).
Then, I looked at Athens to see the distribution of contributions across occupations. I also confirmed the idea that Athens leans more Democratic than Georgia as a whole. I would be interested in comparing contributions by factors such as age, income level, and religious affiliation, and I wish the dataset had some of these attributes.